Impact Evaluation of Reading i-Ready Instruction for Middle School Grades using 2018-19 Data


Abstract 


Curriculum Associates’ i-Ready® Instruction is a supplemental, online personalized instruction
program available for reading and mathematics.¹ The Human Resources Research
Organization (HumRRO), in collaboration with Century Analytics, implemented a
quasi-experimental design (QED) using 2018-19 i-Ready Diagnostic and Instruction data to evaluate
the impact of Curriculum Associates’ reading i-Ready Instruction on student reading
achievement at grades 6-8. We hypothesized student achievement, as measured by the
i-Ready® Diagnostic, would be higher for students using i-Ready Instruction for reading than for a
comparison group of students who did not use this instruction. We conducted matching to
identify a set of comparison students demographically similar to our i-Ready Instruction
treatment students for each grade level. First, we stratified our sample by gender, English
learner status, disability status, and economic disadvantage status. Next, we used propensity
score matching to identify analytic samples of i-Ready Instruction and comparison students
matched on baseline reading achievement. Students who received i-Ready
Instruction and students in the comparison group were administered the reading i-Ready
Diagnostic assessments. To evaluate impact, hierarchical linear modeling (HLM) was
conducted separately for each analytic sample with students at level 1 and schools at level 2.
Results suggest students in grades 6-8 using i-Ready Instruction with fidelity performed
statistically significantly better in reading than students who did not use this
instruction. The effect sizes fall within the range that recent research by Kraft (2019)
found to be typical of education interventions. These findings provide support that, when used with
fidelity, student use of i-Ready Instruction for reading is tied to higher student reading
achievement.


Introduction 


Founded in 1969, Curriculum Associates provides a variety of educational products and 
services with the goal of improving education for students and teachers. Two Curriculum 
Associates products include i-Ready® Diagnostic (available for K-12) and i-Ready® Instruction 
(available for K-8). The i-Ready Diagnostic assessments (a) are online, computer-adaptive 
assessments that pinpoint student needs at the sub-skill level and (b) help monitor the extent to 
which students are on track to achieve end-of-year targets. The i-Ready Diagnostic 
assessments are independent measures often used by educators as classroom benchmark 
assessments. They can be used with or without i-Ready Instruction. We provide additional 
information on the validity and reliability of the i-Ready Diagnostic as a measure of student 
achievement in our methodology discussion below. i-Ready Instruction is a supplemental 
program that provides online, individualized instruction adjusted to student needs. 


The Human Resources Research Organization (HumRRO) is an independent research 
organization that specializes in program evaluation and quantitative methodology. Century 
Analytics is a small business with various education research expertise including quasi- 
experimental design and What Works Clearinghouse (WWC) standards. 


1 https://www.curriculumassociates.com/products/i-ready
HumRRO and Century Analytics conducted an evaluation to examine the impact of i-Ready
Instruction on reading achievement for students in middle school grades 6-8 using 2018-19
data. This was one in a series of evaluations examining the impact of Curriculum Associates’ 
interventions on student achievement. This study was designed to meet the required rigor of the 
WWC 4.0 standards to achieve a rating of Meets WWC Group Design Standards with 
Reservations (WWC, 2017a), and to meet guidelines for a Level 2 (or Moderate) rating for the 
Every Student Succeeds Act (ESSA) guidance for evidence-based research (U.S. Department 
of Education, 2016). To accomplish this, we used a quasi-experimental design (QED), 
established baseline equivalence between the treatment and comparison groups, included 
baseline achievement as a covariate, and used a sampling design that mitigates the effects of 
any confounding factors. 


There were key differences between this study and past studies. Specifically, previous studies
considered school as the unit of i-Ready Instruction assignment, whereas this study considered
students as the unit of assignment. This change in unit of assignment acknowledges the
inherent flexibility of i-Ready Instruction implementation. For example, some schools may
implement at the school level, the grade level, or the classroom level, while other schools may
implement i-Ready Instruction at the individual student level so they can target specific groups
of students. In addition, our past studies included only schools using i-Ready Diagnostic and
Instruction, or i-Ready Diagnostic only for the comparison group, with general education
students. Thus, those schools using i-Ready Diagnostic (with or without Instruction) with select
subsets of students were removed from our sample. Because our data support various types of
implementation occurring across schools, and we understand it is Curriculum Associates’ intent
that these different implementations are valid uses, this study includes students from schools
that are implementing i-Ready Diagnostic with or without Instruction in a variety of ways.


Defining i-Ready Instruction 


The impact of i-Ready Instruction on student achievement was the focus of this evaluation.
i-Ready Instruction is an online personalized instruction program aligned to college- and
career-ready standards that incorporates engaging multimedia instruction and progress monitoring into
online lessons. Lessons are intended to provide a consistent best-practice lesson structure and
build students’ conceptual understanding. i-Ready Instruction is intended to be used in
conjunction with i-Ready Diagnostic, which monitors student progress and identifies student
performance in reading. This diagnostic information helps target student-specific intervention,
which can be provided through i-Ready Instruction.


Curriculum Associates developed a Theory of Action (TOA) that features the key
implementation components of i-Ready Instruction, the intended intermediate outcomes, and
the intended long-term outcomes. The key implementation components highlight actions
recommended for students, teachers, and leaders to obtain the long-term outcome of improved
student learning in reading and mathematics. Among others, the key components include
support at the school and district leadership levels, monitoring of student progress by teachers,
and student use of i-Ready Instruction to work through a personalized, scaffolded instruction
path. The i-Ready Instruction TOA is provided in Appendix A.


Curriculum Associates provides guidance to districts and schools on how to implement i-Ready
Instruction to best benefit student learning (Curriculum Associates, 2019). Guidance indicates
students achieve greater gains when using i-Ready Instruction for an average of at least 30
minutes per week, per subject area. In addition, Curriculum Associates recommends use for 12
to 18 calendar weeks between two administrations of the i-Ready Diagnostic (Curriculum
Associates, 2018).


Research Questions 


The purpose of this study was to determine the impact of i-Ready Instruction on student 
achievement in reading. We examined the following key research question separately for each 
grade 6-8 of our study: 


Do students who use i-Ready Instruction for reading have higher reading achievement
as measured by the i-Ready Diagnostic than students who use i-Ready Diagnostic only?


We hypothesized that student achievement for reading would be higher for students who used
i-Ready Instruction with fidelity, based on the criteria described in the TOA and user guidance
(Curriculum Associates, 2019). Our hypothesis was based on the belief that students benefit 
from the i-Ready Instruction targeted to their specific needs in reading. 


Methodology 


In this section, we describe the methodology for conducting our impact analysis. We begin with 
initial design decisions. We then discuss the student selection and matching process, as well as 
our analytic model and examination of baseline equivalence. Finally, we discuss our impact 
analysis results. 


Initial Design Decisions 
Cluster-Level Design 


We used the student as the unit of assignment for this study to acknowledge the flexibility 
intended by i-Ready Instruction and to include students from schools with various
implementation types. Matching was conducted at the student-level and, thus, the analytic 
model examined the outcome at the student level. However, we also considered potential 
influence of school-level factors and thus decided to include a two-level analytic model with 
school characteristics at level 2 and students at level 1. 


Baseline and Outcome Measure 


We selected the i-Ready Diagnostic as both the baseline and outcome measure for all students 
participating in this study (i.e., i-Ready Instruction students and comparison group students).
i-Ready Diagnostic for reading measures achievement aligned to common reading content and
skills with demonstrated test score reliability. Marginal reliabilities are 0.97 and test-retest
reliabilities range from 0.85 to 0.86 for reading in grades 6-8. Therefore, this assessment meets
the WWC 4.0 standards for an acceptable baseline and outcome measure (WWC, 2017a). 


The i-Ready Diagnostic assessments align to college- and career-ready standards so that 
results can inform student placement decisions, offer explicit instructional advice, and prescribe 
resources for targeted instruction and intervention. The assessments are used by some schools 
and districts in conjunction with i-Ready Instruction and by others as a stand-alone diagnostic 
assessment without the use of i-Ready Instruction. The i-Ready Diagnostic assessments for 
mathematics and reading are currently used by more than 6.5 million students across the United 
States. Thus, the use of i-Ready Diagnostic as the outcome measure allowed us to include a 
large sample of students from across the United States. The i-Ready Diagnostic is intended to
be administered in a standardized manner across schools (Curriculum Associates, 2019b).
Specifically, teachers are to schedule the first (fall) Diagnostic 2-3 weeks into the school year in
two 45- to 50-minute sessions. Teachers also are encouraged to test technology to ensure proper
function and have pencils and paper available as scratch paper. Test administrators provide 
instructions to their students and motivate them to do their best. Teachers monitor students as 
they complete the assessments. 


Multiple studies have been conducted to support the reliability and validity of the reading
i-Ready Diagnostic as well as its consistency with education standards used across the United
States. Since being released in summer 2011, i-Ready Diagnostic has been reviewed and 
approved at the state level as an assessment, instructional resource, or intervention in Arizona, 
California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Indiana, Massachusetts, 
Mississippi, Nevada, New Mexico, New York, North Carolina, Ohio, Oklahoma, Oregon, 
Tennessee, Utah, and Virginia. 


Curriculum Associates has conducted multiple linking studies examining i-Ready Diagnostic
scores for reading at grades 3-8 that provide evidence the i-Ready Diagnostic measures skills
consistent with student expectations and can be used as a student reading achievement
measure. For example, a study using 2016 data examined the correlation between i-Ready
Diagnostic and the Smarter Balanced summative assessments, the Partnership for Assessment
of Readiness for College and Careers (PARCC), and state testing programs in Florida, Georgia,
Indiana, Michigan, Mississippi, New York, North Carolina, Ohio, and Tennessee. These studies
show strong correlations between i-Ready Diagnostic scores and scores on these national and
state tests. The average correlations across grades between the i-Ready Diagnostic for reading
and these assessments ranged from 0.78 (Tennessee TNReady) to 0.85 (Smarter Balanced). These
studies also provide evidence that the i-Ready Diagnostic content is highly consistent with what
students across the United States are expected to learn (Curriculum Associates, 2019). Curriculum
Associates recently completed linking studies for Colorado, Kentucky, and Missouri. In addition, 
Curriculum Associates has commissioned Odell Education and others to complete alignment 
studies to demonstrate the degree of alignment between the content on i-Ready Diagnostic and 
current sets of state standards. Specifically, they have conducted alignment studies for the 
Common Core State Standards (CCSS), and for the Florida, Indiana, Louisiana, Michigan, Ohio, 
and South Carolina state standards. 


Required Number of Students 


We conducted power analyses using Optimal Design software (Spybrook et al., 2011) to identify 
the total number of students required at each grade level to reject the null hypothesis that there 
is no difference in student reading achievement between the treatment and comparison group. 
Statistical power is influenced by various factors. We used data from previous studies HumRRO 
conducted using i-Ready Diagnostic as an outcome to estimate conservative and optimistic
parameters for use in the power analysis. These parameters were: (a) 0.90 for the relationship 
between the baseline and outcome variable, (b) 40 and 60 for the number of students per 
school, and (c) 0.10 and 0.30 for the intraclass correlation coefficient (ICC). Results of the 
power analyses indicated sample sizes of a minimum of 400 students would be sufficient to 
reach our desired statistical power of 0.80. This level of statistical power provides an 80% 
chance of detecting a statistically significant difference with 95% confidence, if one exists. Our 
student samples across all grades far exceeded the minimum. 
Analytic Model


Our model for the impact analyses incorporated student- and school-level variables. The baseline
difference model used to estimate baseline equivalence for our matched sample was based on 
the impact model. As previously discussed, we chose a two-level model with level 1 as the 
student and level 2 as school. 


Impact Model 


We used HLM to estimate the impact of i-Ready Instruction on student achievement. We 
included the following student-level covariates in each analysis: 


• Group membership (0 = comparison; 1 = treatment)
• i-Ready Diagnostic reading baseline performance (grand mean centered)
• Blocking variables (i.e., dummy codes) to account for strata used in matching (described
  in the matching section of this report)


Although we considered the student to be our unit of assignment, with the understanding that 
many schools intentionally do not use i-Ready Instruction with all students, we also wanted to
capture and control for potential school-level factors. We were especially interested in 
identifying variables that would provide unique information from the student-level variables. We 
used the following school-level covariates in each analysis: 


• Traditional school indicator (0 = 6-8 structure; 1 = other)
• Location (town, suburban, rural, city)
• Charter/magnet school indicator (0 = not charter or magnet; 1 = charter or magnet)
• Percent white students
• Percent of students eligible for free and reduced price lunch (FRL)


Our Level 1 model described the relationship between student outcomes, student-level 
characteristics, the baseline covariate, and the strata used for matching. This model level also 
included the treatment indicator. We specified level 1 of the model as follows: 


\[ Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{GROUP}_{ij}) + \beta_{2j}(\mathrm{PRE}_{ij} - \overline{\mathrm{PRE}}_{..}) + \sum \beta_{q}(\mathrm{STRATA}_{ij}) + e_{ij} \]


Where $Y_{ij}$ is the outcome for student $i$ in school $j$. $\beta_{0j}$ is the adjusted mean outcome for
comparison students in school $j$. $\beta_{1j}$ is the adjusted mean difference in outcome due to the
student’s group membership (i.e., the treatment effect), and GROUP is an indicator variable
coded 1 for students in the i-Ready Instruction group and 0 for students in the comparison
group. $\beta_{2j}$ is the adjusted difference in outcome due to the student’s baseline achievement
score (grand mean centered). $\beta_{q}$ is a vector of blocking variables to account for the strata used
in matching. $e_{ij}$ is the random error in the achievement outcome associated with student $i$ in
school $j$ not accounted for in the model.


We specified level 2 of the model as follows: 


\[ \beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{STRUCTURE}_{j}) + \gamma_{02}(\mathrm{CHARTER}_{j}) + \gamma_{03}(\mathrm{PERWHITE}_{j}) + \gamma_{04}(\mathrm{PERFRL}_{j}) + \sum \gamma_{k}(\mathrm{LOCATION}_{j}) + u_{0j} \]
\[ \beta_{1j} = \gamma_{10} \]
\[ \beta_{2j} = \gamma_{20} \]
\[ \sum \beta_{p} = \gamma_{p0} \]
\[ \sum \beta_{q} = \gamma_{q0} \]


Where $\gamma_{00}$ is the grand mean. $\gamma_{01}$ is added to control for school grade-level structure where
STRUCTURE is coded as 0 for schools with a typical grade level structure (6-8 for middle
school) and 1 for schools with an atypical grade structure. $\gamma_{02}$ is the additive effect for charter or
magnet schools. $\gamma_{03}$ and $\gamma_{04}$ are added to control for school characteristics of percent white
and percent FRL, respectively. $\sum \gamma_{k}$ is a vector of three dummy variables to control for school
location. $u_{0j}$ is the random error in the achievement outcome associated with school $j$. The
regression slopes for the treatment, student baseline achievement, student demographics and
strata are fixed across schools.
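

Substituting the level 2 equations into level 1 gives a single combined form of the impact model, which may make it easier to see that the only random terms are the school intercept and the student residual. This is a sketch in the notation used above rather than an equation taken from the report; the student demographic term ($\beta_{p}$) is left out because it does not appear in the level 1 equation as specified.

```latex
\begin{aligned}
Y_{ij} ={}& \gamma_{00}
  + \gamma_{10}(\mathrm{GROUP}_{ij})
  + \gamma_{20}(\mathrm{PRE}_{ij} - \overline{\mathrm{PRE}}_{..})
  + \sum \gamma_{q0}(\mathrm{STRATA}_{ij}) \\
 &+ \gamma_{01}(\mathrm{STRUCTURE}_{j})
  + \gamma_{02}(\mathrm{CHARTER}_{j})
  + \gamma_{03}(\mathrm{PERWHITE}_{j})
  + \gamma_{04}(\mathrm{PERFRL}_{j})
  + \sum \gamma_{k}(\mathrm{LOCATION}_{j}) \\
 &+ u_{0j} + e_{ij}
\end{aligned}
```

In this form, $\gamma_{10}$ is the treatment effect of interest: the adjusted difference in spring scores between i-Ready Instruction and comparison students, holding the baseline score, strata, and school covariates constant.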


Baseline Difference Model 


We used the model below to estimate the baseline difference between students in the treatment 
group and the comparison group. This model follows the same structure as the impact analysis 
model but excludes covariates. 


We specified level 1 of the model as: 
\[ Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{GROUP}_{ij}) + \sum \beta_{q}(\mathrm{STRATA}_{ij}) + e_{ij} \]


Where $Y_{ij}$ is the baseline for student $i$ in school $j$. $\beta_{0j}$ is the adjusted mean outcome for
comparison students in school $j$. $\beta_{1j}$ is the adjusted difference in outcome due to the student’s
study group membership (i.e., the baseline difference), and GROUP is an indicator variable
coded 1 for students in the i-Ready Instruction group and 0 for students in the comparison
group. $\beta_{q}$ is a vector of blocking variables to account for the strata used in matching. $e_{ij}$ is the
random error in the achievement outcome associated with student $i$ in school $j$ not accounted for
in the model.


We specified level 2 of the model as: 
\[ \beta_{0j} = \gamma_{00} + u_{0j} \]
\[ \beta_{1j} = \gamma_{10} \]
\[ \sum \beta_{q} = \gamma_{q0} \]


Identifying a Student Sample 
Defining Eligibility 


For each grade level, we started with a student-level i-Ready usage file of reading i-Ready 
Diagnostic and i-Ready Instruction use in 2018-19 for students who had at a minimum fall and 
spring i-Ready Diagnostic scores. We next filtered to include only public-school students, which 
included traditional public schools and public charter and magnet schools. This ensured we 
were including only students in a relatively traditional school environment with expectations to 
follow state adopted college and career ready standards. 


We also filtered our sample based on availability of student level demographic variables that 
were identified for inclusion in matching and the impact analysis model. Only students with 
available demographic data for (a) gender, (b) English learner (EL) status, (c) special education
status, and (d) economic disadvantage status were included. Prior to removing schools, we
conducted data checks, which indicated that students with available demographic data did not
differ on academic achievement, as measured by the i-Ready Diagnostic, from those who did not have
demographic data. These checks provided assurance that data were missing at random.
However, we also note that schools using the i-Ready products tend to have higher percentages
of minority and low-income students than United States schools overall; thus, though we were
confident our student sample used for matching was academically representative of the public
school students using i-Ready Diagnostic or i-Ready Diagnostic and Instruction, we do not
expect they are representative of all students in the United States.


In addition, for a student to be eligible for the treatment group, they must have used i-Ready
Instruction for reading a minimum of 18 distinct weeks for an average of at least 30 minutes per
week (Curriculum Associates, 2018). This was consistent with guidance on the minimum
i-Ready Instruction usage at the student level for attaining intended goals of improved student
reading achievement. These students also needed to have attended a school that began using
i-Ready Instruction to some extent prior to the 2018-19 school year. This requirement is based
on the understanding that i-Ready Instruction implementation requires a start-up time to learn
the technology and adjustments to scheduling before i-Ready Instruction is fully up and running.
To be eligible for the comparison group, students must not have used any i-Ready Instruction
for reading in 2018-19. We removed students not meeting the treatment or comparison
eligibility requirements from the datafile used in matching.
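

The eligibility rules above can be sketched as a small classification function. This is a minimal illustration, not the procedure used to build the study datafile: the record layout and field names are hypothetical, and it assumes the 30-minute average is taken over the weeks in which a student used i-Ready Instruction, which the guidance does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_WEEKS = 18           # distinct weeks of reading Instruction use
MIN_AVG_MINUTES = 30.0   # average minutes per week of use


@dataclass
class StudentUsage:
    """Hypothetical record; field names are not from the i-Ready usage file."""
    student_id: str
    weekly_minutes: List[float]        # reading Instruction minutes, one entry per week
    school_used_before_2018_19: bool   # school began using Instruction before 2018-19


def assign_group(s: StudentUsage) -> Optional[str]:
    """Return 'treatment', 'comparison', or None (excluded from matching)."""
    weeks_used = [m for m in s.weekly_minutes if m > 0]
    if not weeks_used:
        return "comparison"  # no reading Instruction use at all in 2018-19
    # Assumption: the average is computed over weeks with any use.
    avg_minutes = sum(weeks_used) / len(weeks_used)
    meets_usage = len(weeks_used) >= MIN_WEEKS and avg_minutes >= MIN_AVG_MINUTES
    if meets_usage and s.school_used_before_2018_19:
        return "treatment"
    return None  # some use, but not eligible for either group
```

Under these assumptions, a student with 18 weeks of 35 minutes each in a school with prior use is classified as treatment, while a student with only 10 such weeks is excluded rather than moved to the comparison group.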


For all middle school grades, between 20% and 25% of schools had students assigned to the
treatment group and students assigned to the comparison group. Though we expected some
overlap, this proved problematic for achieving baseline equivalence. This was likely because
these schools were not assigning students randomly to receive either i-Ready Diagnostic and
Instruction or i-Ready Diagnostic only. Rather, the data suggested that lower achieving students
were being assigned to receive i-Ready Diagnostic and Instruction, and the higher performing
students were using i-Ready Diagnostic only. Because these assignments were made by schools
and school was a level in the predetermined baseline difference model, comparisons
within schools resulted in two groups dissimilar on baseline achievement. For the purpose of
this study, we eliminated all schools that included students in both groups.
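

Dropping schools that contribute students to both groups amounts to a simple filter over the eligible students. The sketch below is illustrative only; the tuple layout and function name are assumptions, not part of the study's code.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (student_id, school_id, group); a hypothetical layout for illustration
Student = Tuple[str, str, str]


def drop_mixed_schools(students: List[Student]) -> List[Student]:
    """Keep only students whose school has students in exactly one group."""
    groups_by_school: Dict[str, Set[str]] = defaultdict(set)
    for _, school_id, group in students:
        groups_by_school[school_id].add(group)
    return [s for s in students if len(groups_by_school[s[1]]) == 1]


roster = [
    ("s1", "A", "treatment"),
    ("s2", "A", "comparison"),   # school A has both groups, so it is dropped
    ("s3", "B", "treatment"),
    ("s4", "C", "comparison"),
]
print(drop_mixed_schools(roster))
```

After this step every remaining school is either all treatment or all comparison, so the treatment contrast is estimated between schools rather than within them.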


Matching 


We conducted matching at the student level using a multi-step process. Matching was 
conducted separately by grade (6-8). Thus, we conducted each matching step three separate
times to identify three analytic samples (i.e., three grades). 


First, we stratified our sample by gender, EL status, special education status, and economic 
disadvantage status. This assured that students were only matched to students with identical 
demographic characteristics on these four variables. The variables were selected because they 
are known to be related to student achievement (Hanover Research, 2014; van Langen, Bosker, 
& Dekkers, 2006) and were available through the i-Ready usage datafiles. This stratification 
resulted in 16 strata at each grade. Each stratum contained treatment and comparison students. 
In some strata, the treatment group was larger than the comparison group, or vice versa. Within 
each stratum, we used logistic regression to compute a propensity score for each student (Guo 
& Fraser, 2010). The propensity scores predicted the chance a student belonged to the group
(treatment or comparison) with the smaller number of students, indicated by a value ranging
between 0 and 1, based on the fall i-Ready Diagnostic scores. We used the propensity scores
to match each student from the smaller group (treatment or comparison) to a student from the
larger group. We matched using the nearest neighbor method without replacement (Stuart,
2010). Once matching was conducted for all strata within a grade, we combined the data from 
all strata into one analytic sample. 
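

The matching step within a stratum can be sketched as a greedy one-to-one nearest neighbor search. This is a minimal illustration that takes propensity scores as given; the processing order (ascending score), the tie-breaking rule, and the absence of a caliper are assumptions, since the report does not specify them.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # student_id -> propensity score


def match_nearest(smaller: Scores, larger: Scores) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest neighbor matching without replacement."""
    available = dict(larger)  # copy, so the caller's dict is not modified
    pairs = []
    for sid, score in sorted(smaller.items(), key=lambda kv: (kv[1], kv[0])):
        if not available:
            break
        # Closest remaining student in the larger group; ties broken by id.
        best = min(available, key=lambda lid: (abs(available[lid] - score), lid))
        pairs.append((sid, best))
        del available[best]  # without replacement: each student is used once
    return pairs


def match_all_strata(strata: Dict[str, Tuple[Scores, Scores]]) -> List[Tuple[str, str]]:
    """Match within each stratum, then pool the pairs into one analytic sample."""
    matched = []
    for key in sorted(strata):
        smaller, larger = strata[key]
        matched.extend(match_nearest(smaller, larger))
    return matched


smaller = {"t1": 0.30, "t2": 0.60}
larger = {"c1": 0.28, "c2": 0.33, "c3": 0.61, "c4": 0.90}
print(match_nearest(smaller, larger))  # [('t1', 'c1'), ('t2', 'c3')]
```

Because matching happens only inside a stratum, every pair shares the same gender, EL, special education, and economic disadvantage values by construction.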


Following specification of our analytic and baseline difference models, we removed an average 
of 3.3% of students across the three analytic samples who had incomplete data on the school- 
level variables included in the impact model. This resulted in unequal numbers of students in 
comparison and treatment groups. Figure 1 summarizes the demographic makeup of the final 
set of students in each analytic sample. The counts of students included in each group can be
found in Table 1. As shown, the stratification process used in matching ensured the
i-Ready Instruction and comparison groups were highly similar on the key demographic 
variables, despite the need to remove a small percentage of the sample to account for missing 
school-level variables. 
Figure 1. Demographic makeup of final matched i-Ready Instruction and comparison samples
Although our sampling focused on the student level, to gain additional understanding of where
our student sample was from, we examined the distribution of students across urbanicity
categories, as defined through school-level variables of the National Center for Education 
Statistics (NCES) publicly available database. Figure 2 shows that schools in the i-Ready 
Instruction and comparison groups share a relatively similar urbanicity distribution. 
Figure 2. Students’ school urbanicity for final matched i-Ready Instruction and
comparison samples 


Baseline Equivalence 


Once our analytic samples were identified, we used our baseline difference model to estimate 
the adjusted mean differences between our i-Ready Instruction and comparison groups of 
students at each grade level. We converted the estimated baseline difference between students 
in the two groups to an effect size to evaluate baseline equivalence for each of the three 
analytic samples. For all three samples, Hedges’ g was much smaller than the WWC required 
threshold of 0.25 (see Table 1), so we determined the groups were baseline equivalent (WWC, 
2017b). 
Table 1. Reading Baseline Equivalence Statistics for i-Ready Instruction (Treatment) and
Comparison Groups by Grade


Grade  Group                N      i-Ready Mean  i-Ready SD  Adj Mean Diff  Effect Size
6      i-Ready Instruction  9,972  564.82        51.31       -3.32          -0.06
       Comparison           9,793  567.90        53.43       (2.12)
7      i-Ready Instruction  7,736  567.89        56.00       -4.28          -0.08
       Comparison           7,604  572.17        53.49       (3.06)
8      i-Ready Instruction  5,344  574.39        58.41       -6.44          -0.11
       Comparison           5,333  580.83        54.94       (3.30)


Notes: SD = standard deviation of i-Ready scores, Adj Mean Diff = adjusted mean difference
between i-Ready Instruction and comparison groups, and Effect Size = Hedges’ g.
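

The effect sizes in Table 1 can be approximately reproduced from the reported adjusted mean differences, group sizes, and standard deviations. The sketch below assumes the usual pooled-standard-deviation form of Hedges' g with a small-sample correction; the report does not state the exact computation, so the function and variable names are illustrative.

```python
import math


def hedges_g(adj_diff, n_t, sd_t, n_c, sd_c):
    """Hedges' g from an adjusted mean difference and unadjusted group SDs."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    omega = 1 - 3 / (4 * df - 1)  # small-sample correction; near 1 at these sample sizes
    return omega * adj_diff / pooled_sd


# grade: (adj mean diff, treatment N, treatment SD, comparison N, comparison SD)
TABLE_1 = {
    6: (-3.32, 9972, 51.31, 9793, 53.43),
    7: (-4.28, 7736, 56.00, 7604, 53.49),
    8: (-6.44, 5344, 58.41, 5333, 54.94),
}

WWC_THRESHOLD = 0.25  # baseline equivalence threshold cited in the text

for grade, row in TABLE_1.items():
    g = hedges_g(*row)
    # Rounds to -0.06, -0.08, and -0.11 for grades 6, 7, and 8.
    print(grade, round(g, 2), abs(g) < WWC_THRESHOLD)
```

All three values fall well inside the 0.25 threshold, consistent with the conclusion that the matched groups were equivalent at baseline.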


Impact Analysis Results 


After confirming our matched samples were baseline equivalent at each grade, we estimated 
the impact of i-Ready Instruction on student achievement using the analytic model described 
above, with spring 2019 i-Ready Diagnostic scores as the outcome. Analyses were conducted 
separately for each grade. This section describes the results of the analysis. Full information on
the model results, including student- and school-level covariate parameters, is presented in
Appendix B. 


In addition to estimating the impact of i-Ready Instruction, we also examined three model 
assumptions associated with two-level HLM—residual normality, independence, and 
homoscedasticity—using the MIXED_DX macro in SAS (Bell, Smiley, Ene, & Blue, 2014). No 
major violations were found. Additional details regarding the assumption checks are available in 
Appendix C. 


Table 2 contains the impact model results by grade for reading spring i-Ready Diagnostic 
scores. For all grade levels, the adjusted mean differences were positive, indicating the i-Ready 
Instruction group earned higher scores than the matched comparison group. All mean 
differences were statistically significant (α = .05) with Hedges’ g effect sizes ranging from 0.05
to 0.09. These effect sizes are promising for an education intervention. Though traditional
guidance has suggested these effect sizes are small (Lipsey et al., 2012), recent research by
Kraft (2019) notes traditional guidelines, including those reported by Lipsey, are often too rigid 
for the realities of education interventions. He specifies effect size ranges of 0.03-0.17 as 
typical of education interventions and that these often represent a meaningful effect. He 
suggests effect sizes should be considered in conjunction with all aspects of an intervention, 
including the magnitude of the treatment contrast and costs. 


Table 2 also provides the intra-class correlations (ICCs) by grade. The ICCs measure the 
proportion of the variance between schools, that is, how much of the variance in reading
i-Ready Diagnostic scores can be explained by school-level differences. The ICCs range from
0.26 (grade 6) to 0.30 (grade 7). This suggests the majority of variance is due to factors other
than school-level differences; however, we prefer ICCs to be below 0.20. The elevated ICCs
may be impacted by the variation in implementation methods and our decision to model
implementation at the student level. This finding will assist in future efforts for identifying the
most appropriate unit of assignment to account for these variations. 
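

For reference, the ICC in a two-level random-intercept model is conventionally written in terms of the two variance components. The symbols $\tau_{00}$ and $\sigma^{2}$ are standard HLM notation introduced here for illustration; the report does not state whether its ICCs come from a model with or without covariates.

```latex
\mathrm{ICC} = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}},
\qquad \tau_{00} = \mathrm{Var}(u_{0j}),
\quad \sigma^{2} = \mathrm{Var}(e_{ij})
```

Read this way, an ICC of 0.26 means roughly 26% of the score variance lies between schools and the remaining 74% lies among students within schools.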
Table 2. Impact Analysis Results for i-Ready Instruction (Treatment) and Comparison Groups for Reading Student
Achievement by Grade


Grade  Group                ICC   N Students  i-Ready Mean  i-Ready SD  Adj Mean Diff (SE)  p-value  Effect Size
6      i-Ready Instruction  0.26  9,972       586.80        53.25       2.48                0.042    0.05
       Comparison                 9,793       584.32        55.37       (1.22)
7      i-Ready Instruction  0.30  7,736       589.45        57.29       3.62                0.013    0.06
       Comparison                 7,604       585.83        55.35       (1.44)
8      i-Ready Instruction  0.27  5,344       599.22        58.93       5.27                0.005    0.09
       Comparison                 5,333       593.95        55.75       (1.87)


Notes: ICC = intraclass correlation, SD = standard deviation of i-Ready scores, Adj Mean Diff = adjusted mean difference between
i-Ready Instruction and comparison groups, SE = standard error of the adjusted mean difference, and Effect Size = Hedges’ g.
Summary and Discussion


At all grades, impact analyses suggest that middle school students who use i-Ready Instruction
with fidelity have higher achievement in reading when compared to students who did not use
i-Ready Instruction. At each grade, students in the i-Ready Instruction group had a statistically
significantly higher reading i-Ready Diagnostic score than did students in a matched
comparison group.


The effect sizes provided additional evidence i-Ready Instruction is beneficial for improving
student reading. Recent research (Kraft, 2019) suggests education interventions typically attain
effects ranging from 0.03 to 0.17. Our effect sizes for all grades fell within this range. Kraft
(2019) notes one should consider various factors when interpreting effect sizes, including a
program’s cost relative to its benefits and the size of the treatment contrast. For example, we
note that i-Ready Instruction is a supplemental intervention that requires only 12 to 18 weeks of
30 minutes or more per week during a school year to be considered implemented with fidelity at
the student level. In addition, because i-Ready Instruction is not a full curriculum and students
in the i-Ready Instruction and comparison groups are likely exposed to much of the same
instruction otherwise, we believe the contrast between our treatment and
comparison groups is likely minimal. Similarly, it is possible some students in our comparison
group were exposed to interventions like i-Ready Instruction. Thus, given the required effort for
using i-Ready Instruction with fidelity is relatively low, and the contrast between the i-Ready
Instruction and comparison groups is small compared to a more involved intervention or curricular
program, we feel confident that our effect sizes are meaningful.


Kraft (2019) also points out that the U.S. education system is decentralized, and implementation
procedures are ultimately controlled by local schools and/or teachers. As a QED, this study did
not attempt to control for curriculum, supplemental resources, or classroom structure. Students
in both groups were not participants in a research study; rather, they were actual customers
and everyday users, and i-Ready Instruction was carried out in real-world conditions. We may
have found even larger effect sizes had the study been conducted under more controlled
circumstances. Impacts are typically greater for studies that aim for ideal or close to ideal
implementation and smaller for studies that examine real-world implementation. Thus, the fact
that we found statistically significant effects for all grade levels despite the lack of controls
is promising.


We conducted this study differently from a past study using 2017-18 data by considering the
unit of assignment to be the student instead of the school. Additionally, we used 2018-19 data
to take advantage of the most recent available information. Despite these key differences, our
results were highly consistent: both studies found the treatment group performed better than the
comparison group. This replication provides confidence that students using i-Ready Instruction in
conjunction with the i-Ready Diagnostic show greater reading achievement than a
comparison group using the i-Ready Diagnostic only.


Our study was conducted as a rigorous QED to meet the current standards described by the 
WWC (WWC, 2017) to achieve a rating of Meets WWC Group Design Standards with 
Reservations. In addition, because we found statistically significant positive effects for all grades, 
this study meets the guidelines set forth by ESSA for a Level 2 (or Moderate) rating for evidence- 
based research (U.S. Department of Education, 2016). 
Limitations and Implications for Future Studies


This study provides strong evidence supporting reading i-Ready Instruction use for students.
Through our long-standing relationship with Curriculum Associates and multiple impact
evaluations, including the current study, we have developed recommendations for the foci of
future studies that may provide additional evidence to support the impact of i-Ready Instruction.


First, our ICCs were above 0.20 for all grades, suggesting school differences may be important
for matching and estimating treatment effects. However, the data also revealed large variations
in how many students at a given school or grade within a school used i-Ready Instruction with
fidelity. Future studies may look to explore the grade or classroom as the unit of assignment.
We also found that schools choosing to implement i-Ready Instruction for select students and the
i-Ready Diagnostic only for others were generally selecting low-performing students for the
i-Ready Diagnostic and Instruction group and other students for the i-Ready Diagnostic only
group, such that baseline equivalence could not be achieved for students within school. We
recommend Curriculum Associates collect information directly from schools to understand their
intended implementation so this information can be incorporated into sample selection and
analytic models.
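

To illustrate why ICCs above 0.20 matter, the sketch below computes an ICC from the two variance
components of a two-level model and the corresponding design effect for clustered samples. The
variance components and the average cluster size are hypothetical values chosen only so that the
ICC matches the 0.26 reported for grade 6; they are not taken from the study data.

```python
def intraclass_correlation(between_school_var, within_school_var):
    """Share of total outcome variance that lies between schools."""
    total = between_school_var + within_school_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_school_var / total


def design_effect(icc, avg_cluster_size):
    """Variance inflation from clustering, relative to a simple random sample."""
    return 1 + (avg_cluster_size - 1) * icc


# Hypothetical variance components (not from the report), picked to give ICC = 0.26.
icc = intraclass_correlation(between_school_var=780.0, within_school_var=2220.0)

# Hypothetical average of 50 analyzed students per school.
deff = design_effect(icc, avg_cluster_size=50)
print(round(icc, 2), round(deff, 2))
```

With an ICC of this size, even moderately sized schools inflate sampling variance considerably,
which is consistent with modeling schools at level 2.
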


Second, we note our study was a QED with the typical limitations, including a lack of information
on implementation decisions made at each school and within each classroom. We recommend
randomized controlled trials (RCTs) in the future even if only a small sample of schools and
students is included. We also suggest including only one district to allow greater control over
implementation.


Finally, our treatment group was compared to a matched comparison group using the i-Ready
Diagnostic. It is possible that use of the i-Ready Diagnostic itself increases student achievement.
However, the design of this study did not allow for an estimation of that impact. Further, using
i-Ready Diagnostic only schools and students as the comparison group may have attenuated
the estimated effects of i-Ready Instruction relative to what would have been observed had the
treatment group been compared to a “business-as-usual” comparison group. Future studies might
examine the impact of i-Ready Instruction using a set of comparison schools and students not
implementing any Curriculum Associates products. This would require an external achievement
measure, potentially a state assessment, as the baseline and outcome measure.


Quality Control Procedures 


We employed various quality control checks throughout the data cleaning, analysis, and 
reporting processes. HumRRO, Curriculum Associates, and Century Analytics worked together 
to identify a rigorous methodology based on implementation of i-Ready Instruction with fidelity, 
the WWC 4.0 standards, and ESSA Level 2 guidelines. 


Rules for identifying treatment and comparison groups were determined through collaboration
between the three study partners. Curriculum Associates provided information on the various
components of i-Ready Instruction and the frequency with which it should be used for
implementation with fidelity. They also provided i-Ready Diagnostic and Instruction data to allow
HumRRO and Century Analytics to empirically examine the extent to which these
recommendations were followed by i-Ready Instruction schools. These discussions led to
treatment and comparison group criteria in which all partners were confident.


Data analysis work was completed collaboratively by HumRRO and Century Analytics. Century 
Analytics and HumRRO independently conducted matching and HLM analyses for each grade. 
The researchers reviewed results against each other and worked out any discrepancies. All 
results reported in this study were verified by researchers from both organizations. 
Appendix A. i-Ready Instruction Theory of Action


The i-Ready Diagnostic is an adaptive assessment that assesses students on relevant skills in a challenging and engaging way, capturing insight about
student learning down to the subskill level. Teachers are provided with precise, actionable data and instructional recommendations to more seamlessly
differentiate classroom instruction according to their students’ needs, saving teachers valuable time. This allows teachers to deliver more impactful
instruction to increase student growth and proficiency.


The i-Ready Diagnostic is an adaptive assessment that assesses students
three times a year on relevant skills in a challenging and engaging way,
capturing insight about individual student math and/or reading strengths
and needs down to the sub-skill level.


The following implementation program components help to maximally
leverage the i-Ready Diagnostic scores for differentiated instruction:


- Students access their customized i-Ready dashboard to view their data,
  performance, and progress.

- Teachers attend Professional Development sessions to acquire i-Ready
  skills and concepts.

- Teachers ensure that students’ i-Ready Diagnostic scores are valid and
  reliable by:
    - Adequately preparing students before taking the i-Ready Diagnostic
    - Planning to retest students with abnormal test results (ex: red rush
      flags)
    - Monitoring and observing students during the i-Ready Diagnostic
      administration

- Teachers access the i-Ready dashboard and reports for:
    - Precise and actionable performance and growth data
    - Clear tools such as student can dos and next steps for instruction,
      including grade placement levels, that highlight student needs down
      to the sub-skill level
    - Student groups based on similar instructional needs
    - Typical and Stretch Growth values for each student
    - Monitoring student growth over time

- Teachers can display class goals, performance, and progress through data
  walls or other methods.

- School and district leaders provide necessary system support: serving as
  instructional leaders, supporting teachers with implementation, ensuring
  the required technology is in place, setting appropriate schedules to
  administer the i-Ready Diagnostic, clearly communicating those
  administration windows, and accessing reports to view student and class
  data to better understand resource needs.
Appendix B. Impact HLM Coefficients


Table B.1. HLM Results for Sixth Grade Reading

Parameter                                        Estimate       SE        t        p  95% CI Lower  95% CI Upper
Student-Level Covariates
  Treatment Group Membership                         2.48     1.22     2.04    0.042          0.10          4.86
  Fall 2018 Reading i-Ready Grand Mean Centered      0.81     0.00   168.35   < .001          0.80          0.81
Student-Level Stratum
  Female, ELL = 0, SpEd = 0, EcDis = 0              15.89     2.45     6.49   < .001         11.09         20.69
  Female, ELL = 1, SpEd = 0, EcDis = 0              13.72     3.09     4.43   < .001          7.65         19.78
  Female, ELL = 0, SpEd = 1, EcDis = 0               4.98     2.71     1.84    0.066         -0.33         10.29
  Female, ELL = 0, SpEd = 0, EcDis = 1              13.24     2.40     5.52   < .001          8.53         17.94
  Female, ELL = 1, SpEd = 1, EcDis = 0              -0.26     5.15    -0.05    0.959        -10.36          9.83
  Female, ELL = 0, SpEd = 1, EcDis = 1               1.59     2.67     0.60    0.550         -3.63          6.82
  Female, ELL = 1, SpEd = 0, EcDis = 1              10.68     2.72     3.93   < .001          5.36         16.00
  Female, ELL = 1, SpEd = 1, EcDis = 1              -3.61     4.32    -0.84    0.402        -12.08          4.85
  Male, ELL = 0, SpEd = 0, EcDis = 0                13.95     2.45     5.70   < .001          9.16         18.75
  Male, ELL = 1, SpEd = 0, EcDis = 0                10.06     2.92     3.44    0.001          4.33         15.78
  Male, ELL = 0, SpEd = 1, EcDis = 0                 5.43     2.55     2.13    0.033          0.43         10.43
  Male, ELL = 0, SpEd = 0, EcDis = 1                12.30     2.40     5.13   < .001          7.59         17.00
  Male, ELL = 1, SpEd = 1, EcDis = 0                 2.22     4.56     0.49    0.627         -6.73         11.16
  Male, ELL = 0, SpEd = 1, EcDis = 1                 2.95     2.55     1.16    0.247         -2.04          7.94
  Male, ELL = 1, SpEd = 0, EcDis = 1                10.29     2.67     3.86   < .001          5.06         15.53
School-Level Covariates
  Charter or Magnet Designation                      1.64     1.50     1.09    0.276         -1.30          4.58
  Traditional Middle School                          4.57     1.11     4.12   < .001          2.39          6.74
  Percent non-white students                        -0.04     0.02    -1.44    0.151         -0.08          0.01
  Percent FRL students                              -0.05     0.03    -1.69    0.091         -0.10          0.01
  Locale: Suburban                                  -1.94     1.84    -1.06    0.290         -5.54          1.66
  Locale: Rural                                     -2.38     1.32    -1.80    0.072         -4.98          0.21
  Locale: City                                      -0.64     2.08    -0.31    0.759         -4.72          3.44
Intercept
  Intercept                                        574.31     3.10   185.44   < .001        568.24        580.38

Note. Stratum 16 and Locale: Town were used as reference groups in the model.
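

The columns in the HLM tables can be cross-checked against one another. The sketch below takes the
Treatment Group Membership estimate and standard error from Table B.1 and recomputes the test
statistic, a two-sided p-value, and a 95% confidence interval. It assumes a normal (Wald)
approximation; the reported values were presumably computed from unrounded estimates with
model-based degrees of freedom, so small differences are expected. The function name is illustrative.

```python
import math


def wald_check(estimate, se, z_crit=1.96):
    """Return (z, two-sided p, CI lower, CI upper) under a normal approximation."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return z, p, estimate - z_crit * se, estimate + z_crit * se


# Treatment Group Membership row of Table B.1 (rounded values as printed).
z, p, ci_lower, ci_upper = wald_check(estimate=2.48, se=1.22)
print(f"z = {z:.2f}, p = {p:.3f}, 95% CI = [{ci_lower:.2f}, {ci_upper:.2f}]")
```

The recomputed values land within rounding distance of the tabled 2.04, 0.042, 0.10, and 4.86.
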
Table B.2. HLM Results for Seventh Grade Reading

Parameter                                        Estimate       SE        t        p  95% CI Lower  95% CI Upper
Student-Level Covariates
  Treatment Group Membership                         3.62     1.44     2.51    0.012          0.79          6.44
  Fall 2018 Reading i-Ready Grand Mean Centered      0.79     0.01   138.76   < .001          0.78          0.80
Student-Level Stratum
  Female, ELL = 0, SpEd = 0, EcDis = 0              17.36     3.13     5.55   < .001         11.23         23.48
  Female, ELL = 1, SpEd = 0, EcDis = 0              13.98     3.64     3.84   < .001          6.84         21.11
  Female, ELL = 0, SpEd = 1, EcDis = 0               7.77     3.41     2.28    0.023          1.09         14.44
  Female, ELL = 0, SpEd = 0, EcDis = 1              13.63     3.08     4.43   < .001          7.60         19.65
  Female, ELL = 1, SpEd = 1, EcDis = 0              14.73     6.18     2.38    0.017          2.62         26.84
  Female, ELL = 0, SpEd = 1, EcDis = 1               3.82     3.31     1.15    0.248         -2.67         10.31
  Female, ELL = 1, SpEd = 0, EcDis = 1              12.82     3.36     3.82   < .001          6.24         19.40
  Female, ELL = 1, SpEd = 1, EcDis = 1              -1.05     5.14    -0.20    0.838        -11.14          9.03
  Male, ELL = 0, SpEd = 0, EcDis = 0                15.61     3.12     5.00   < .001          9.49         21.73
  Male, ELL = 1, SpEd = 0, EcDis = 0                13.89     3.64     3.82   < .001          6.76         21.01
  Male, ELL = 0, SpEd = 1, EcDis = 0                 7.66     3.22     2.38    0.017          1.34         13.97
  Male, ELL = 0, SpEd = 0, EcDis = 1                10.33     3.07     3.36    0.001          4.31         16.35
  Male, ELL = 1, SpEd = 1, EcDis = 0                 5.46     6.53     0.84    0.403         -7.34         18.26
  Male, ELL = 0, SpEd = 1, EcDis = 1                 2.70     3.18     0.85    0.396         -3.53          8.94
  Male, ELL = 1, SpEd = 0, EcDis = 1                 8.59     3.28     2.62    0.009          2.16         15.01
School-Level Covariates
  Charter or Magnet Designation                     -0.24     1.73    -0.14    0.888         -3.63          3.14
  Traditional Middle School                          2.89     1.32     2.19    0.029          0.30          5.48
  Percent non-white students                         0.07     0.03     2.29    0.022          0.01          0.14
  Percent FRL students                              -0.09     0.04    -2.64    0.008         -0.16         -0.02
  Locale: Suburban                                   1.26     2.31     0.54    0.587         -3.27          5.79
  Locale: Rural                                     -0.42     1.56    -0.27    0.790         -3.47          2.64
  Locale: City                                       0.81     2.70     0.30    0.765         -4.48          6.09
Intercept
  Intercept                                        571.22     3.86   147.87   < .001        563.64        578.79

Note. Stratum 16 and Locale: Town were used as reference groups in the model.
Table B.3. HLM Results for Eighth Grade Reading

Parameter                                        Estimate       SE        t        p  95% CI Lower  95% CI Upper
Student-Level Covariates
  Treatment Group Membership                         5.27     1.87     2.81    0.005          1.60          8.94
  Fall 2018 Reading i-Ready Grand Mean Centered      0.77     0.01   117.64   < .001          0.76          0.79
Student-Level Stratum
  Female, ELL = 0, SpEd = 0, EcDis = 0              18.35     4.14     4.44   < .001         10.24         26.45
  Female, ELL = 1, SpEd = 0, EcDis = 0              23.43     4.63     5.06   < .001         14.35         32.51
  Female, ELL = 0, SpEd = 1, EcDis = 0               8.37     4.30     1.95    0.052         -0.06         16.80
  Female, ELL = 0, SpEd = 0, EcDis = 1              16.27     4.07     4.00   < .001          8.30         24.25
  Female, ELL = 1, SpEd = 1, EcDis = 0             -11.77    11.18    -1.05    0.292        -33.68         10.14
  Female, ELL = 0, SpEd = 1, EcDis = 1               9.06     4.30     2.11    0.035          0.63         17.49
  Female, ELL = 1, SpEd = 0, EcDis = 1              15.56     4.42     3.52   < .001          6.90         24.22
  Female, ELL = 1, SpEd = 1, EcDis = 1              11.16     8.36     1.34    0.182         -5.21         27.54
  Male, ELL = 0, SpEd = 0, EcDis = 0                16.56     4.12     4.02   < .001          8.49         24.64
  Male, ELL = 1, SpEd = 0, EcDis = 0                17.32     4.61     3.76   < .001          8.29         26.35
  Male, ELL = 0, SpEd = 1, EcDis = 0                 6.34     4.22     1.50    0.133         -1.94         14.61
  Male, ELL = 0, SpEd = 0, EcDis = 1                12.71     4.07     3.12    0.002          4.73         20.69
  Male, ELL = 1, SpEd = 1, EcDis = 0                 1.43     7.35     0.19    0.846        -12.98         15.85
  Male, ELL = 0, SpEd = 1, EcDis = 1                 5.84     4.19     1.39    0.164         -2.37         14.05
  Male, ELL = 1, SpEd = 0, EcDis = 1                 9.11     4.33     2.10    0.035          0.62         17.60
School-Level Covariates
  Charter or Magnet Designation                      0.22     2.23     0.10    0.923         -4.16          4.59
  Traditional Middle School                         -0.06     1.70    -0.04    0.970         -3.41          3.28
  Percent non-white students                         0.03     0.04     0.81    0.416         -0.05          0.11
  Percent FRL students                              -0.10     0.05    -2.15    0.031         -0.19         -0.01
  Locale: Suburban                                  -0.78     2.94    -0.26    0.792         -6.55          4.99
  Locale: Rural                                     -3.32     2.02    -1.65    0.099         -7.27          0.63
  Locale: City                                      -5.56     3.49    -1.60    0.110        -12.40          1.27
Intercept
  Intercept                                        583.32     4.98   117.13   < .001        573.56        593.08

Note. Stratum 16 and Locale: Town were used as reference groups in the model.
Appendix C. Model Assumption Checks


We examined three model assumptions associated with two-level HLM (residual normality,
independence, and homoscedasticity) using the MIXED_DX macro in SAS (Bell, Smiley, Ene,
& Blue, 2014) based on the analytic model for all three grade levels of this study. The
MIXED_DX macro provides visual output including box-and-whisker plots, histograms, scatter
plots, and summary tables to examine residual normality, linearity, homoscedasticity, and
influential outliers. The macro provides this information for level 1 and level 2 residuals.


We reviewed plots and summary tables at level 1 and level 2 for each grade level. These
checks provided assurance that our analytic model was appropriate for our data. We examined
histograms, box-and-whisker plots, and scatter plots to check residual normality. These plots
supported that our residuals were generally normally distributed. In particular, the histograms of
level 2 residuals produced a highly symmetrical bell shape with little skewness or kurtosis. The
level 1 residuals had some skewness but were close enough to normal to allow confidence.
There was no evidence when examining level 1 residuals of clearly non-normal distributions
such as a bimodal distribution. Violation of assumptions of normality of level 1 residuals can
adversely affect estimation of random effect coefficients and variance-covariance components,
but typically will not adversely affect estimation of standard errors and, therefore, inferences
regarding statistical significance. Given the primary purpose of the models was estimating
treatment effects, the slight lack of normality of the level 1 residuals likely did not have
implications for the findings presented in this report.
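

The checks described above were run with the MIXED_DX macro in SAS. As a language-neutral
illustration only, and not the macro's implementation, the sketch below computes the sample
skewness and excess kurtosis that summarize how far a set of residuals departs from normality.
The residual values are made-up examples, not study data.

```python
def skewness_and_excess_kurtosis(residuals):
    """Moment-based skewness and excess kurtosis (both are 0 for a normal distribution)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # second central moment
    if m2 == 0:
        raise ValueError("residuals have zero variance")
    m3 = sum((r - mean) ** 3 for r in residuals) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in residuals) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical residuals: a symmetric set and a right-skewed set.
symmetric_skew, symmetric_kurt = skewness_and_excess_kurtosis([-2.0, -1.0, 0.0, 1.0, 2.0])
skewed_skew, skewed_kurt = skewness_and_excess_kurtosis([0.0, 0.0, 0.0, 0.0, 10.0])
print(symmetric_skew, skewed_skew)
```

Values near zero on both statistics correspond to the symmetrical, bell-shaped histograms
described for the level 2 residuals; a clearly nonzero skewness corresponds to the mild
asymmetry noted at level 1.
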


Scatter plots of predicted values against residuals at level 1 and level 2 clearly illustrated
random distributions and provided support that the assumptions regarding independence and
homoscedasticity were not violated.